[57] ABSTRACT 

A high-speed, large-quantity data transfer system is realized 
using a synchronous control means having striped disks in 
which the cost of synchronizing the disks is reduced by 
software control, using a time-out means, and using a data 
transfer means that does not use user memory. An input-output 
processing system for inputting and outputting a large 
quantity of data comprises a logical disk control means. The 
logical disk control means comprises a performance data 
collection means for collecting data and a logical disk 
construction means for constructing a logical disk apparatus 
using a plurality of disk apparatus. The logical disk 
construction means forms logical disk management data so 
that the stripe width is set in order to equalize the response 
time needed for input and output of one stripe of data on each 
disk apparatus constructing the logical disk apparatus, and 
the logical disk control means controls the logical disk 
apparatus by the logical disk management data. 

18 Claims, 9 Drawing Sheets 
APPARATUS FOR FORMING LOGICAL 
DISK MANAGEMENT DATA HAVING DISK 
DATA STRIPE WIDTH SET IN ORDER TO 
EQUALIZE RESPONSE TIME BASED ON 
PERFORMANCE 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

The present invention relates to an input-output processing 
system in a computer system, and in particular to an 
input-output processing system for inputting and outputting 
large amounts of data at high speed. 

2. Description of the Prior Art 

In recent years, a disk subsystem called Redundant Array 
of Inexpensive Disks (RAID) has spread in personal computers 
and workstations, following the miniaturization and falling 
prices of magnetic disk storage devices. RAID is a logical 
disk apparatus comprising a plurality of disk apparatus in 
order to improve performance and reliability. In general, one 
of six construction levels, from RAID level 0 to RAID level 5, 
is adopted according to the purpose of use. RAID is generally 
realized as a function of a disk controller, and in a personal 
computer it is often realized as a function of a SCSI host 
adaptor. 

On the other hand, it is also known that a logical disk 
apparatus can be constructed from a plurality of disk apparatus 
controlled by software instead of hardware. Typical products 
using such technology include Logical Volume Manager of IBM 
and Volume Manager of Veritas. These products have functions 
such as disk concatenation, disk mirroring and disk striping. 
Some products attempt to cover all RAID functions. 

In a RAID system, RAID level 3 or RAID level 0 is used 
in order to input and output large quantities of data rapidly. 
RAID level 3 is a byte-interleaved construction in which 
parity data is added to the data layout. RAID level 0 is 
equivalent to disk striping by software, in which parity data 
is not added. 

Two improvements in input-output performance can be expected 
from adopting disk striping and RAID. One is an improvement 
of the input-output performance seen by users, which is 
obtained by allocating input-output requests to the respective 
disk apparatus and operating these disk apparatus in parallel 
when the requested input-output data size spans a plurality of 
the disk apparatus over which the data is interleaved. The 
other is an improvement of the through-put performance of a 
system, which is obtained by distributing concurrent accesses 
to the logical disk apparatus over a plurality of disk 
apparatus. In order to input and output large quantities of 
data rapidly, the former improvement is pursued in the 
present invention. 

Although input and output for a logical disk apparatus 
constructed by striping are distributed to the respective 
disk apparatus, it is necessary to synchronize all of the 
distributed inputs and outputs for every input-output access 
to the logical disk apparatus. If there is a difference in 
processing time between the respective disk apparatus 
operating in parallel, a waiting time corresponding to the 
difference is needed before the processing finishes. This 
deteriorates the input-output performance of the logical disk 
apparatus. Factors that disturb the synchronization include 
performance differences between the respective disk apparatus, 
seek (positioning), dynamic performance differences caused by 
rotational waiting and queuing on the data bus, and so on. 

In order to decrease the overhead of waiting time between 
disk apparatus in a RAID system, for example in a system to 
which RAID level 3 is applied, it is known that the system 
is constructed from homogeneous disk apparatus and all disk 
rotations are synchronized so as to avoid the disturbance of 
synchronization caused by rotational waiting of the disks, and 
that the sector positions at which data are arranged are 
shifted little by little so as to absorb the delay of 
input-output instructions to the respective disk apparatus. 
However, this technique assumes that the RAID can be designed 
as a single hardware subsystem. (MICHELLE Y. KIM, 
"Synchronized Disk Interleaving", IEEE TRANSACTIONS ON 
COMPUTERS, VOL. C-35, NO. 11, NOVEMBER 1986) 

Further, from a more detailed point of view, prior art is 
disclosed in laid-open Japanese patent publication No. 
5-27910, where location gap correction of the head of a disk 
apparatus is regarded as a factor disturbing synchronization 
between disk apparatus. In this prior art, when a location 
gap correction is needed in a certain reference disk, commands 
are issued to every disk apparatus comprising the disk array 
to perform location gap correction, in order to minimize the 
synchronization disturbance caused by location gap correction. 

Laid-open Japanese patent publication No. 5-257611 
discloses that a logical disk apparatus is realized in a 
flexible construction by software control in a RAID system. 
In other words, it discloses a data layout method in which 
logical disk apparatus having different RAID levels are mixed 
flexibly on one or a plurality of disk arrays by partitioning 
the disk array, and performance deterioration caused by 
disturbed synchronization between disk apparatus is decreased. 

As described above, a logical disk apparatus constructed 
by software control has been disclosed in the prior art. 
However, no logical disk apparatus has been constructed on 
the basis of performance data obtained by measuring the disk 
apparatus comprising the logical disk apparatus, in order to 
construct the most suitable logical disk apparatus. 

The present invention provides an efficient input-output 
processing system to improve the through-put of the entire 
system. More concretely, the object of this invention is to 
improve the performance of an input-output system by 
constructing the most suitable logical disk apparatus using 
ordinary disk apparatus instead of using RAID. 

In processing large quantities of data such as image data, 
the quantity of input-output data at one time becomes very 
large. For this case, the invention provides an input-output 
processing system which receives less data than the specified 
quantity and can start its processing earlier, instead of 
waiting for the entire specified quantity of data to be 
transmitted in response to one input-output instruction. 

Further, the invention provides an input-output system 
which makes it possible to transmit data efficiently between 
input-output units. 

SUMMARY OF THE INVENTION 

According to one aspect of the invention, an input-output 
processing system for inputting and outputting a large 
quantity of data comprises a logical disk control means, 
where the logical disk control means includes a performance 
data collection means for collecting performance data, either 
from performance characteristics of a plurality of disks 
constructing the input-output system which are given by a 
system manager, or from direct measurement of the performance 
by operating the disks; and a logical disk construction means 
for constructing a logical disk using all disks on the basis 
of the performance data collected by the performance data 
collection means, where the logical disk construction means 
produces logical disk management data so that the stripe 
width is set in order to equalize the response time needed for 
input and output of one stripe of data on each disk 
constructing the logical disk; and the logical disk control 
means controls the logical disks by the logical disk 
management data. 

According to a further aspect of the invention, the 
input-output processing system for inputting and outputting a 
large quantity of data further comprises a time limit setting 
means for setting a limit time required to operate the 
input-output unit when sending an input-output instruction to 
input-output units; and a means for completing the 
input-output operation which is started by the input-output 
instruction when the set time limit has passed. 

According to a further aspect of the invention, the 
input-output processing system for inputting and outputting a 
large quantity of data further comprises a time limit setting 
means for setting a limit time in relation to a [...] 
input-output instruction when sending the input-output 
instruction to [...] completing the input-output operation 
which [...] input-output instruction when the set time limit 
[...] 

According to a further aspect of the invention, in the 
input-output processing system for inputting and outputting a 
large quantity of data, the performance measurement [...] 
response time by setting conditions in which the response 
time for each disk apparatus is the shortest or longest. 

According to a further aspect of the invention, in the 
input-output processing system for inputting and outputting a 
large quantity of data, the performance [...] means carries 
out performance measurement of the disk apparatus as a part 
of [...] means collects bus performance data by construction 
of the input-output bus which a plurality [...] connected to, 
system construction data to which the system manager gives 
bus transfer performance, or operation of an input-output 
unit connected to the bus by the performance measurement 
means; the logical disk construction means [...] collected 
performance [...] 

According to a further aspect of the invention, in the 
input-output processing system for inputting and outputting a 
large quantity of data, when an input-output request is [...] 
logical disk control means dynamically judges the state of 
the head position of the disk [...] shortly before sending an 
input-output instruction [...] disk apparatus constructing the 
logical disk apparatus, and sends the input-output 
instruction so that a [...] for carrying out input and output 
to the logical disk apparatus becomes shortest. 

According to a further aspect of the invention, [...] when an 
input-output request is [...] the logical disk control means 
[...] instruction so that the input-output [...] disk apparatus 
becomes shortest by [...] 

According to a further aspect of the invention, [...] a 
logical disk apparatus which [...] the system manager. 

According to a further aspect of the invention, the 
input-output processing system for inputting and outputting a 
large quantity of data [...] a data transfer driver for [...] 
transfer control to input-output units, [...] 

[...] a bus arbiter for controlling [...] input-output unit; 
the [...] and a destination [...] ID and data [...] 
respectively; and the data transfer [...] using system memory 
[...] the bus arbiter when an apparatus for sending the [...] 
[...] quantity of data further comprises a plurality of disk 
apparatus comprising the logical disk apparatus which are 
connected to different input-output buses [...] plurality of 
logical disk apparatus, and further comprises a means arranged 
between disk apparatuses connected to the same input-output 
bus at a disk control [...] controlling respective disk 
apparatus connected to the respective input-output buses, in 
order to carry out data copy between different logical disk 
apparatus. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 shows an example of a hardware construction of a 
system which an input-output processing system of the 
present invention is applied to. 

FIG. 2 shows another example of a software construction [...] 

[...] provided by a performance measurement means of the 
present invention. 

FIG. 4 shows a flowchart of processes [...] logical disk 
apparatus automatically which is carried out by a volume 
manager of the present invention. 

FIG. 5 shows an example of logical disk management data 
of the present invention. 

FIG. 6 shows an example of logical disk construction data 
of the present invention. 

FIG. 7 shows a construction of a disk controller of the 
present invention. 

[...] the present invention. 

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 

EMBODIMENT 1 

[...] apparatus, like a sort process, and outputs the 
processing result again to the disk apparatus. 

Further, four disk apparatus 111 having the same 
specifications are connected to each SCSI bus 112 through a 
disk controller 110. [...] apparatus may be connected to each 
SCSI bus 112 through [...] 

[...] construction of the present [...] FIG. 2, a control 
program 201 carries out data [...] from a logical disk 
apparatus to high-speed device 108 [...] from high-speed 
device 108 to disk apparatus 111 or logical disk apparatus. 
This control program [...] a user application program. In this 
embodiment, [...] operating system 202 is mainly [...] and 
functions of its operating system are hardly [...] words, the 
control program 201 only instructs to write data of [...] from 
an offset 1 MB of device A to device [...] data transfer 
driver 203 is incorporated into a kernel as a pseudo driver. 
A volume manager 204 constructed by a logical disk [...] 
(volume manager 204 constructs a [...] logical disk control 
means). A disk driver 205 (a part of [...] high-speed device 
108. When a data [...] function of bus [...] 105 is utilized, 
data transfer driver 203 carries out the control function 
directly. The logical [...] means controls not only a logical 
disk apparatus but also input-output units other than a 
logical disk apparatus. 

Here, a function of volume manager 204, which is the [...] 
input-output by a stripe construction of the logical disk 
apparatus. In other words, volume manager 204 [...] disk 
apparatus using [...] apparatus [...] of volume manager 204 
according to the instruction of a system manager. A user 
accesses the logical disk apparatus [...] conducts interface 
when accessing the logical disk apparatus using logical disk 
construction data [...] generated by logical disk construction 
means 213. 

[...] disk striping is summarized as [...] To be easily 
understood, assuming that two [...] 
input-output bus which is high-speed enough. Consecutive 
large quantities of data having a certain size are divided 
into units of, for example, 128 sectors (this unit of data 
is called a 'chunk' hereafter). Chunks are stored alternately 
on the respective disk apparatus in order from the first. When 
an input-output request of two-chunk size is issued at the top 
of a logical disk apparatus constructed by striping, the 
input-output request is distributed to the two disk 
apparatuses and processed. Therefore, the apparent 
input-output performance is doubled. 
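
The alternating chunk layout described above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names `locate` and `split_request` are hypothetical and do not appear in the specification, and addresses are counted in sectors from zero. 

```python
CHUNK_SECTORS = 128  # chunk size used in the example above


def locate(logical_sector, n_disks=2, chunk=CHUNK_SECTORS):
    """Map a logical sector to (disk index, sector on that disk).

    Chunks are placed on the disks in turn, starting from the first disk.
    """
    chunk_no, within = divmod(logical_sector, chunk)
    disk = chunk_no % n_disks          # which disk holds this chunk
    local_chunk = chunk_no // n_disks  # position of the chunk on that disk
    return disk, local_chunk * chunk + within


def split_request(start, length, n_disks=2, chunk=CHUNK_SECTORS):
    """Split one logical request into per-disk (disk, sector, count) pieces."""
    pieces = []
    pos, end = start, start + length
    while pos < end:
        disk, sector = locate(pos, n_disks, chunk)
        count = min(chunk - pos % chunk, end - pos)  # stay inside one chunk
        pieces.append((disk, sector, count))
        pos += count
    return pieces


# A two-chunk request at the top of the logical disk touches both disks once.
two_chunk_request = split_request(0, 2 * CHUNK_SECTORS)
```

With two disks, the two pieces can be issued in parallel, which is the source of the apparent doubling of performance. 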

The following example explains how a system manager who 
needs a disk apparatus whose performance is twice as good as 
that of a usual disk apparatus realizes it by utilizing 
two usual disk apparatuses and the volume manager 204. The 
system manager introduces the volume manager 204 into a 
system. The volume manager 204 gives an instruction for 
introducing two new disk apparatus to the disk introduction 
means 207, and registers them under the control of the volume 
manager 204. Receiving the request for registration, the 
volume manager 204 initializes the disk apparatus 111 and 
writes management metadata (logical disk management data 210) 
to the disk apparatus 111 for constructing a logical disk 
apparatus. At this stage, a logical disk apparatus is not 
yet constructed. Further, the system manager requests the 
logical disk construction means 213 in the volume manager 
204 to construct a logical disk apparatus by striping, 
using the two disk apparatus. The logical disk construction 
means 213 forms logical disk management data 211; 
thereby, a user is finally enabled to utilize the logical disk 
apparatus. When a user accesses the logical disk 
apparatus, the user accesses a logical disk input-output 
interface 212 through the operating system 202. Receiving the 
input-output request, the volume manager 204 refers to the 
logical disk construction data of the logical disk apparatus, 
and gives an input-output instruction to the actual disk 
apparatus 111 which comprise the logical disk apparatus. 

Operations of this input-output processing system are 
explained in order below. To begin with, the performance 
measurement of an individual disk apparatus and the 
measurement of bus efficiency, which are carried out when a 
disk apparatus is introduced into a system, are explained. 

First, they are summarized as follows. These measurements 
estimate the response performance according to the 
input-output size of the disk apparatus, the response 
performance according to disk address, the disk cache 
effect and so on, by using the performance measurement 
means 208 (a part of the performance data collecting means), 
in order to know the performance features of the respective 
disk apparatus constructing the stripes. Concretely, 
input-output operation of the disk apparatus is carried out 
using these parameters, and the response time (a part of 
performance data 209) is measured and the result is recorded. 
On the basis of the performance data 209 of the respective 
disk apparatus, the data quantity which should be allocated 
to each disk apparatus, that is, the stripe width, is 
determined so as to equalize the response time required for 
input and output of one stripe of data on each disk apparatus 
comprising the disk stripe. According to the resultant data, 
the logical disk construction means 213 constructs the 
logical disk construction data 211, that is, a logical disk 
apparatus of striping construction. Input-output requests to 
a logical disk apparatus of such a construction are 
distributed to the respective disk apparatus comprising the 
stripe, and the respective disk apparatus respond in 
approximately the same time. Therefore, roughly speaking, the 
apparent response time is reduced to one over the number of 
disk apparatus constructing the stripes, in comparison with 
the response time for input and output to a single disk 
apparatus. 
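
The effect of equalizing the per-disk response time can be seen in a small numerical sketch. It assumes a deliberately simplified model that is not stated in the specification: each disk's response time is its stripe width divided by its transfer rate, with seek and rotational waiting ignored, and the two transfer rates are hypothetical values chosen only for illustration. 

```python
def striped_response_time(widths_kb, rates_kb_s):
    """Seconds until the slowest disk has transferred its share.

    A striped request completes only when every disk has finished,
    so the response time is the maximum of the per-disk times.
    """
    return max(w / r for w, r in zip(widths_kb, rates_kb_s))


rates = [2000, 1000]  # hypothetical transfer rates of two disks, in KB/s

# 128 KB split evenly: the faster disk finishes early and then waits.
equal = striped_response_time([64, 64], rates)

# 128 KB split in proportion to the rates: both disks finish together.
balanced = striped_response_time([256 / 3, 128 / 3], rates)
```

In this model the balanced split removes the waiting time that the even split leaves on the faster disk, so `balanced` is smaller than `equal`. 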




The volume manager 204 judges whether the requested 
performance of the logical disk apparatus can be ensured when 
construction of a logical disk apparatus is requested by a 
system manager, on the basis of this measured performance 
data 209. Introduction of a disk apparatus is carried out by 
an instruction to format (initialize) it, after the disk 
apparatus is connected, the system is started up, and the 
disk connection is confirmed. By using the format data as the 
write data, disk performance measurement and initialization 
of the disk apparatus can both be carried out together. This 
is not a mere format: metadata (disk management 
data 210) is also written so that the disk apparatus is 
controlled under the volume manager 204. FIG. 3 shows a 
result of the performance measurement carried out on a disk 
apparatus 111. The numerals in FIG. 3 show the above 
mentioned metadata which are stored in the disk apparatus. 
In other words, they are referred to as a snapshot. 

In FIG. 3, a reference name 501 specifies the disk 
apparatus 111 in the system, and index information 502 
specifies the type of the disk apparatus 111. The volume 
manager 204 obtains this information by reading out a data 
base given by the system manager, which includes catalogue 
data relating to the disk apparatus 111, or by reading out the 
label information of the disk apparatus 111. This information 
gives the disk rotation speed and the number of sectors per 
track on each cylinder. In FIG. 3, the numeral 503 denotes the 
capacity of the disk apparatus 111, the numeral 504 denotes an 
access time without seek and rotational waiting time, and the 
numeral 505 denotes an access time, in microseconds, when a 
seek is carried out from the most distant 
position with one rotational waiting time. This measurement 
is carried out by measuring the elapsed time between the 
completion of an access to the reference position and the 
completion of an access to the next sector or to the most 
distant sector. The ten data from 506 to 507 express transfer 
rates in KB/s, obtained by measuring the transfer rate at 
positions every 100 MB from cylinder zero. These disk 
performance data 209 are referred to or utilized when a 
logical disk is constructed, as explained later. It is assumed 
that there is no difference between reading performance and 
writing performance for the disk apparatuses which are used 
in this embodiment. 
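
One way to picture the record of FIG. 3 is as a small data structure. The sketch below is an assumption for illustration only: the class name, field names and sample values are hypothetical, the unit of item 504 is assumed to match item 505, and only the item numbers in the comments come from the description above. 

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskPerformance:
    reference_name: str       # 501: identifies the disk in the system
    disk_type: str            # 502: index information (type of disk)
    capacity_mb: int          # 503: capacity
    min_access_us: int        # 504: access without seek or rotational wait
    max_access_us: int        # 505: farthest seek plus one rotational wait
    transfer_kb_s: List[int]  # 506..507: rate measured every 100 MB

    def rate_at(self, offset_mb: int) -> int:
        """Transfer rate (KB/s) of the 100 MB zone containing offset_mb."""
        zone = min(offset_mb // 100, len(self.transfer_kb_s) - 1)
        return self.transfer_kb_s[zone]


# Hypothetical sample values, not taken from FIG. 3.
sample = DiskPerformance(
    reference_name="c0t0d0",
    disk_type="type-A",
    capacity_mb=1000,
    min_access_us=500,
    max_access_us=20000,
    transfer_kb_s=[2000, 1980, 1950, 1900, 1850, 1800, 1700, 1600, 1500, 1400],
)
```

A lookup such as `sample.rate_at(250)` returns the rate measured for the zone starting at 200 MB, which is the kind of position-dependent value used later when stripe widths are chosen. 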

In general, when a new disk apparatus is introduced into 
a system, a process called format is carried out for finding 
bad sectors and writing label information. In this 
embodiment, this format process may be carried out at the 
same time as the process for collecting performance data 
209 of the disk apparatus by the performance measurement 
means 208, as described above. Therefore, the cost of 
measuring performance can be reduced, and the burden 
on the system manager can also be reduced. 

The following explains the measurement of the 
through-put performance of an input-output bus. The object 
of this measurement is to construct a logical disk while 
considering, in addition to the input-output performance of 
the disk apparatus, the data transfer performance (a part of 
performance data 209) of the input-output bus, which could be 
a factor preventing synchronized operation, and conflicts on 
the same bus. Input-output units, including all disk 
apparatus, connected to each of a plurality of input-output 
buses are driven in combination so as to load the bus, and 
the response time of the input-output operation of each 
input-output unit is measured. Thereby, it is known whether 
there is any spare capacity for transferring data on each 
input-output bus and also on the higher-level input-output 
buses to which that input-output bus is connected, or what 
the limit value of the transfer ability is. A logical disk 
apparatus is constructed by 
selecting the disk apparatus constructing the stripes so as 
to avoid reaching the data transfer limit of a common 
input-output bus through the data transfers of disk apparatus 
operating in parallel. Thereby, the performance deterioration 
caused by an input-output bus bottleneck when data are input 
and output to a logical disk apparatus can be avoided. 

Accordingly, this measurement determines not only the plain 
through-put performance but also the conditions under which 
input-output bus conflicts are generated for the input-output 
units connected to the bus being measured, according to the 
starting order, and actually generates a bus conflict to 
measure the input-output latency of each input-output unit. 
Thereby, it is possible to know the bus arbitration policy 
and to insert a starting delay in order to avoid unnecessary 
bus conflicts generated when starting the input-output of the 
disk apparatus comprising the stripes. In this embodiment, 
however, the through-put performance of the SCSI bus 112 and 
the local bus 107, to which a disk apparatus is connected, is 
measured, and the stripe is constructed so as not to exceed 
the limit of the through-put performance of each input-output 
bus when input and output are made to the logical disk 
apparatus comprised of stripes. Regarding the SCSI bus 112, 
the limit value of the through-put performance can be 
obtained by running all input-output units connected to the 
bus 112 (the same four disk apparatuses are connected to each 
SCSI bus 112 in the same way in this embodiment), and by 
calculating the total amount of input-output data obtained 
within a certain period of time. By comparing the total 
amount obtained in the above process with the through-put 
performance in the case where one disk apparatus operates, it 
is known how many disk apparatuses can operate without being 
limited by the bus transfer rate. In this embodiment, since 
the maximum transfer rate of the disk apparatus is 2 MB/s and 
the maximum transfer rate of the SCSI bus 112 is 8 MB/s, the 
operation is not limited by the bus transfer rate even if the 
four disk apparatus connected to the SCSI bus 112 operate at 
the same time. 

Similarly, according to measurement of the transfer rate of 
the local bus 107 by operating every disk apparatus and the 
high-speed device 108, it is known that the transfer rate 
corresponds to the total of the transfer rates of all 
input-output units, at 40 MB/s. Consequently, no matter how 
input-output operations are carried out on the input-output 
units, the construction of input-output buses and 
input-output units of this embodiment is not limited by the 
transfer rate of the input-output bus. 

However, in a case different from this embodiment, for 
example, if the transfer rate of the SCSI bus is slower than 
that of this embodiment and only two disk apparatuses can 
operate at their maximum transfer rate within the transfer 
rate of the SCSI bus, three or more disk apparatuses 
comprising the same logical disk apparatus constructed by 
striping cannot be taken from the disk apparatus connected to 
the same SCSI bus. 

Next, a construction method of a logical disk apparatus is 
explained. FIG. 4 shows a flow chart which explains 
operations of the logical disk construction means 213 which 
constructs a logical disk apparatus by striping. The method 
is explained referring to FIG. 4. In step 601, a user, in 
other words, a system manager, inputs the necessary 
performance of the striped logical disk apparatus, for 
example 8 MB/s, the input-output size which an application 
program issues when accessing the logical disk apparatus, for 
example 512 KB, and the necessary size of the logical disk 
apparatus, for example 1 GB. Thereby, this input-output 
processing system tries to construct a logical disk apparatus 
which satisfies the specification. Here, the input-output 
size is specified in order to restrict the number of stripes 
and the stripe width. In other words, the performance 
improvement by disk striping cannot be obtained if the 
input-output size is not a multiple of the above mentioned 
chunk size. 

In step 602, the number of disk apparatus which must operate 
in parallel to achieve the specified performance of the 
logical disk apparatus, in other words, the WAY number of the 
stripe, is determined according to the performance data 209 
of the disk apparatus obtained by the performance measurement 
means 208. After the WAY number is determined, vacant areas 
on the disk apparatus controlled by this input-output 
processing system are searched at step 603, and it is 
determined whether it is possible to construct stripes on as 
many different disk apparatus as the necessary WAY number. 
However, disk striping is not always needed in order to 
satisfy the specified performance level. If it is not 
necessary, vacant areas are simply searched on the disk 
apparatuses. Then, at step 604, it is judged from the given 
input-output size whether there is any effect on performance, 
or whether the requested performance level does not need any 
disk striping, when constructing a logical disk apparatus 
having a stripe construction whose chunk size is equal to the 
input-output size. If no effect on performance is found 
because of the constraint on stripe width, or no disk 
striping is necessary, disk areas are reserved and the 
expected performance of the logical disk apparatus is 
reported to the user at step 610 before completion. 

When disk striping of two or more WAYs is effective, it is 
judged at step 605 whether a stripe can be constructed from 
disk apparatus of the same kind. If it is possible, it is 
judged at step 606 whether there is any performance 
difference between the inner and outer circumferences of the 
disk. If there is a performance difference between the inner 
and outer circumferences, it is judged at step 607 whether 
the disk apparatus constructing the stripes can be used from 
the same offset position. If they can be used from the same 
offset position, the stripes are constructed on the disk 
apparatus alternately in opposite directions at step 608. If 
the result is no (indicated "n") at step 605, step 606 or 
step 607, the stripe is constructed so as to equalize the 
response speed of the respective disk apparatus by means of 
the stripe width, at step 609. 

The operations in step 609 are explained further as follows. 
First, the chunk size is determined. The input-output size is 
regarded as the chunk size unless the specified input-output 
size is larger than the amount of stripe width necessary to 
achieve the logical disk apparatus performance. On the other 
hand, when the input-output size is too large, the 
input-output size divided by a natural number is regarded as 
the chunk size as necessary. Now, assume that 512 KB is 
specified as the input-output size. In this case, for example, 
the chunk size becomes 128 KB, which is a quarter of 512 KB, 
if it is possible to achieve the performance by 4 WAYs of 
32 KB average stripe width. The data transfer rate at the 
position used on each disk apparatus constructing the stripes 
is obtained from the performance data 209 of the disk 
apparatus which has already been obtained. The chunk is 
divided by the inverse ratio of the data transfer rate to get 
the stripe width. Thereby, the response times of the disk 
apparatus constructing the respective stripes are equalized 
when an input-output operation is carried out on the logical 
disk apparatus. 

Next, the operations in step 608 are explained further. In 
this case too, the chunk size and the stripe width of each 
disk apparatus are determined first. Stripes are constructed 
alternately from the top and from the end of the same usage 
area of an even number of disk apparatus constructing the 
stripes. The stripe width is equivalent to one track width. 
The size of one track at the head position and at the tail 
position is known from the number of sectors included in the 
tracks at that position, shown by the performance information 
502 of the disk apparatus, and the chunk size is determined 
as (size of the head track + size of the tail track) x 
(number of stripe WAYs / 2). Accordingly, the stripe width is 
equivalent to the size per track at the time of input and 
output. 
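
The sizing arithmetic in the bus check and in steps 602, 609 and 608 can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of FIG. 4 itself: all function names are hypothetical, the WAY number is assumed to be the smallest count of disks whose combined rate reaches the target, and the division of the chunk in step 609 is read here as giving each disk a width proportional to its transfer rate, which is the split that makes the per-disk transfer times equal. 

```python
import math


def ways_needed(target_kb_s, disk_kb_s):
    """Step 602 (assumed rule): fewest parallel disks that meet the target."""
    return math.ceil(target_kb_s / disk_kb_s)


def disks_per_bus(bus_kb_s, disk_kb_s):
    """Disks that can stream at full rate before one bus is saturated."""
    return bus_kb_s // disk_kb_s


def stripe_widths(chunk_kb, rates_kb_s):
    """Step 609: split one chunk so every disk needs the same transfer time."""
    total = sum(rates_kb_s)
    return [chunk_kb * r / total for r in rates_kb_s]


def reverse_chunk_sectors(head_track_sectors, tail_track_sectors, ways):
    """Step 608: (head track + tail track) x (number of WAYs / 2), in sectors."""
    if ways % 2:
        raise ValueError("step 608 uses an even number of disk apparatus")
    return (head_track_sectors + tail_track_sectors) * ways // 2


# Figures from the embodiment: 8 MB/s target, 2 MB/s disks, 8 MB/s SCSI bus.
ways = ways_needed(8 * 1024, 2 * 1024)
per_bus = disks_per_bus(8 * 1024, 2 * 1024)
widths = stripe_widths(128, [2 * 1024] * ways)
```

With identical disks the 128 KB chunk is divided into four 32 KB stripe widths, matching the example in the text; with unequal rates the faster disks receive proportionally more data. 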

When the logical disk construction data 211 for constructing 
a logical disk apparatus has been generated and registered as 
described above, the performance of this logical disk 
apparatus which can be expected at the recommended 
input-output size, in other words, the chunk size determined 
internally, is reported at step 610 before completion. When 
disk striping is not necessary for the performance level and 
input-output size given beforehand, or when no improvement of 
performance is expected, the input-output size necessary to 
satisfy the given performance level and the performance of a 
single disk are reported for a single disk apparatus. 

FIG. 5 and FIG. 6 show partial management information
of a logical disk apparatus constructed as described above
and of the respective stripe-constructed disk apparatus
comprising the logical disk apparatus. Copies of this
management information are stored as metadata on every disk
apparatus 111 controlled by this input-output processing
system, and are read into memory 102 controlled by this
input-output processing system as necessary. FIG. 5 and
FIG. 6 show general tabular formats for convenience, but
this information is stored on the actual disk apparatus and
memory according to a template controlled by the program.

A logical disk apparatus constructed according to step 608
and management information of stripe-constructed disk
apparatus, shown in FIG. 5 and FIG. 6, are explained as
follows. FIG. 5 includes a logical volume name 701 of a
logical disk apparatus, which is accessed by the logical
volume name 701 from an application program. A volume
size 702 shows a volume size of the logical disk. A volume
type 703 shows a volume type of the logical disk, which is
4 WAY stripe construction in the present example. A usage
region 704 shows the disk apparatus constructing stripes and
areas of the disk apparatus, which are respectively specified
by the names c0t0d0-0, c1t0d0-0, c2t0d0-0 and c3t0d0-0. In
order to be easily understood, a disk apparatus is specified
by c0t0d0-0, and an area number is specified by -0.
However, they are logical expressions and do not identify the
addresses on the hardware. Further, FIG. 6 shows area
information of disk apparatus shown by usage region 704,
that is, region information specified by index c0t0d0-0. FIG.
6 includes the region name 801, physical disk name 802 of
a physical disk apparatus where the region is arranged,
starting offset 803 of the region, region size 804, data
arrangement direction 805, and stripe width 806 of the region.
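
As a rough illustration of the management information in FIG. 5 and FIG. 6, the fields can be written as two Python record types. The class names, field types, units and all example values are assumptions for this sketch; only the field meanings (701 to 704 and 801 to 806) come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Direction(Enum):
    FORWARD = "forward"   # data arranged from the top of the region
    REVERSE = "reverse"   # data arranged from the tail of the region


@dataclass
class Region:
    """One entry of the region information (FIG. 6)."""
    region_name: str        # 801
    physical_disk: str      # 802
    start_offset: int       # 803, bytes (unit assumed)
    size: int               # 804, bytes (unit assumed)
    direction: Direction    # 805
    stripe_width: int       # 806, bytes (unit assumed)


@dataclass
class LogicalVolume:
    """Management information of one logical disk apparatus (FIG. 5)."""
    name: str                  # 701
    size: int                  # 702, bytes (unit assumed)
    volume_type: str           # 703
    usage_regions: List[str]   # 704, region names used as indexes into FIG. 6

    def ways(self) -> int:
        """Number of stripe WAYs, taken here as the number of usage regions."""
        return len(self.usage_regions)


# Hypothetical values; only the region names follow the example in the text.
volume = LogicalVolume(
    name="vol0",
    size=1 << 30,
    volume_type="4 WAY stripe",
    usage_regions=["c0t0d0-0", "c1t0d0-0", "c2t0d0-0", "c3t0d0-0"],
)
region = Region("c0t0d0-0", "c0t0d0", 0, 256 << 20, Direction.FORWARD, 64 * 512)
```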

Next, an operation is explained when an input-output
operation is requested to a logical disk apparatus having
management information (logical disk construction data
211) shown in FIG. 6 and FIG. 5. When an application
program (control program 201 is also a kind of application
program) accesses the present logical disk apparatus,
chunk size as input-output size, and an offset value for the
logical disk apparatus are given. In other words, the
application program can inquire of a volume manager 204 about
chunk size beforehand. When volume manager 204 receives
the inquiry request, it reports attributes, such as chunk size
of the appointed logical disk apparatus. When this
input-output processing system receives the input-output
request, the system refers to management data, that is,
logical disk construction data 211 of the logical disk
apparatus, and knows that the logical disk apparatus is
constructed by 4 WAY stripe construction as shown in the
usage region 704. When the system is informed by the region
information that this logical disk apparatus is constructed
by stripe, the system calculates how many chunks of the
stripes correspond to the offset. For a region arranged in
the reverse direction, in other words, for a reverse
directional region, the accumulated total track size which
corresponds to the number of offset chunks from the tail of
the region is regarded as an offset and the track size is
regarded as an input-output size. Then, the system gives an
input-output instruction to the respective disk apparatus
comprising stripes. For a forward directional region, the
offset is calculated similarly from the top of the region,
and an input-output instruction is given by regarding the
size of the track as an input-output size. For all regions
constructing a logical disk apparatus, input and output
control is started from the far-distant head position of the
disk apparatus maintained by the disk driver 205, and the
input and output operation to the logical disk apparatus is
completed after waiting for the completion of all
input-output operations to the disks.
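
The per-region offset calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: one stripe equals one track, `track_sizes` lists the track sizes from the top of the region to its tail, and the reverse-direction offset counts only the tracks before the one being accessed. The function name `locate_chunk` is hypothetical.

```python
from typing import List, Tuple


def locate_chunk(track_sizes: List[int], chunk_index: int,
                 reverse: bool) -> Tuple[int, int]:
    """Return (offset, io_size) of the chunk_index-th stripe in one region.

    For a forward region the offset is the accumulated track size measured
    from the top of the region; for a reverse region it is accumulated from
    the tail. In both cases the input-output size is the size of the track
    that holds the stripe.
    """
    if not 0 <= chunk_index < len(track_sizes):
        raise IndexError("chunk index lies outside the region")
    ordered = track_sizes[::-1] if reverse else track_sizes
    offset = sum(ordered[:chunk_index])
    return offset, ordered[chunk_index]


# Hypothetical region whose tracks shrink from the top towards the tail.
tracks = [100, 90, 80, 70]
forward = locate_chunk(tracks, 2, reverse=False)
backward = locate_chunk(tracks, 2, reverse=True)
```

Because a forward region and a reverse region are paired, a large track in one is matched by a small track in the other for the same chunk index.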

As already explained, the track size at each position inside
the region is obtained from the database which this
input-output processing system provides for every kind of
disk apparatus.

According to this embodiment, a performance collection
means collects performance features of disk apparatus
connected to the system, by data given by the system manager
or by performance measurement means. A logical disk is
constructed by logical disk construction means on the basis
of the collected data, and the logical disk is operated by
logical disk control means. Therefore, the response time
needed for input-output per stripe of data of each disk
comprising the logical disk is averaged to improve logical
disk performance.

When a new disk apparatus is introduced into the system,
the performance data 209 of the disk apparatus can be
collected by performance measurement means 208, combined
with the format processing which is always carried out to
find bad sectors and to write label information. Therefore,
the burden on the system manager can be reduced as well as
the cost for collecting the performance data 209.

Further, input-output operations for measuring the maximum
access time, the minimum access time, and the transfer rate
are actually carried out on each disk apparatus mounted to
the system. The response time is measured, and a logical
disk is constructed using a table having the measured data.
Therefore, the most suitable logical disk apparatus can be
constructed.

Further, the logical disk apparatus is constructed
considering the data transfer performance of the input-output
bus and the competition on the same bus, which could be
factors that hinder input-output performance and synchronous
operation of disk apparatus. An operation of the system is
carried out by using the data transfer performance of the
input-output bus and construction data given by the system
manager, or by combining input-output units including every
disk apparatus connected to a plurality of input-output buses
by the performance measurement means. Thereby, the system is
loaded, and the response time in input-output operation of
each input-output unit is measured. According to the
measurement, it is decided whether there is any margin for
the data transfer ability in each input-output bus, or in a
higher input-output bus to which the input-output bus is
connected, or what the limit value of transfer ability is. A
disk apparatus having logical disk construction constructed
by stripe is selected so that the data transfer of disk
apparatus operating in parallel does not reach the limit of
the data transfer ability of the common input-output bus, at
input and output to the logical disk apparatus constructed by
stripe. Therefore, performance deterioration can be avoided,
which is caused by an input-output bus bottleneck at the
input-output operation for the logical disk apparatus.

The data per chunk constructing stripes of the logical disk
apparatus is arranged to be included in one track of the disk
apparatus. Thereby, a means is provided in order to average
head seek cost (corresponding to the time for positioning) by
the sequential access regardless of the seek direction.
Therefore, a more flexible disk drive could be constructed
because the difference becomes small whether the seeking
operation is carried out from outer to inner of the
circumference or from inner to outer of the circumference,
except for the disk cache effect at reading out the data.

Further, when a plurality of disk apparatuses having
performance difference between inner and outer of the
circumference are constructed by stripe, the stripe is
constructed so that inner and outer circumference are combined
alternately. Thereby, uniform input-output response time can
be obtained, regardless of read/write positions of the logical
disk apparatus constructed by stripe.

Further, when there are input-output requests to a logical
disk apparatus constructed by stripe, the instruction order of
input-output to all of the necessary disk apparatus is
controlled in order to finish the input-output operation as
the logical disk apparatus within the shortest period of time,
considering the distance from the present disk head position
to the access position requested and the input-output size
having advantageous performance of the disk apparatus
constructed by each stripe. Therefore, efficiency of the
input-output operation is improved as the logical disk
apparatus.

Further, since elements such as WAY number and stripe
width comprising a logical disk apparatus are generated
automatically, the burden on the system manager is reduced,
and the most suitable logical disk is constructed easily.

EMBODIMENT 2

This embodiment carries out a plurality of input-output
operations together to the same disk apparatus, when an
input-output request is sent by the input-output size
including a plurality of chunks to the logical disk apparatus
constructed as explained in the first embodiment. This
embodiment is based on the assumption that input-output
performance of logical disk apparatus could be improved as
a whole, if the input-output operation to the logical disk
apparatus can finish earlier, by sending a plurality of
input-output instructions to the same disk apparatus together
at one time, when an input-output request is sent by the
input-output size including a plurality of chunks to the
logical disk apparatus constructed by stripe.

However, depending on the disk apparatus constructing a
logical disk apparatus, the priority determination system
of the input-output bus connected to the disk apparatus, or
the characteristics of the device driver which drives the
disk apparatus, it does not always improve the whole
performance to send a plurality of input-output instructions
to the same disk apparatus together. In such a case,
input-output time for a logical disk apparatus is improved by
positively inserting a synchronizing point which synchronizes
operation between disk apparatuses as necessary.

In this embodiment, a data chaining function is needed
since a plurality of input-output instructions are gathered
into one input-output instruction. This data chaining
function is realized by providing a scattering/gathering
function to disk driver 205 and SCSI bus adapter 109 while
data are transferred to/from the memory, and using a DMA list.

In a logical disk apparatus explained in the first
embodiment, gathering a plurality of input-output
instructions does not always improve the performance since the
logical disk apparatus includes a region where data
arrangement direction is reversed to the seek direction.
Therefore, this gathering of input-output instructions is not
carried out in the second embodiment. If the logical disk
apparatus in the first embodiment is a logical disk apparatus
constructed by the step 609 explained in FIG. 4, that is, a
logical disk apparatus which equalizes response speed between
disks by the stripe width, an input-output bus has enough
transfer ability to be regarded as advantageous for this
embodiment. Therefore, gathering a plurality of input-output
instructions on the same disk apparatus is carried out.

EMBODIMENT 3

The third embodiment realizes high-speed processing in
an input-output processing system of the present invention.
FIG. 7 shows an implementation means which is arranged in
disk controller 110. FIG. 7 includes a SCSI control portion
302, a disk portion 304 and a timer portion 305. In this
embodiment, the timer portion 305 is arranged in the disk
controller 110. However, it is also possible to use the system
timer 103 connected to memory bus 104 in place of the timer
portion 305.

First, operations of disk driver 205 and disk controller 110
are explained as follows. Disk controller 110 can receive a
time-out value to be set in timer portion 305 (a part of time
limit arrangement means) by a SCSI vendor unique command
(a part of time limit arrangement means), from disk driver
205 via SCSI bus adapter 109, and SCSI control portion 302.
After the time-out value is set in a count register (not shown
in FIG. 7) inside the timer by the command, the timer starts
counting down as soon as it receives a read or write command
from an initiator (generator of command). When the count
register indicates zero, timer portion 305 stops counting
down and reports to SCSI control portion 302 that the
counting value became zero. Receiving this report, SCSI
control portion 302 asserts abort lines of the controller
internal bus. Whatever phase it is in, disk control portion
304 records the phase at the time of interruption and the
transferred data quantity into an internal register (not
shown in FIG. 7) which can be referred to from SCSI control
portion 302 before it stops the operation. If SCSI control
portion 302 is disconnected from SCSI bus 112, it moves to a
status phase after reconnecting from the initiator, and sends
the transferred data quantity (status information) as a
message. If the timer is still operating when the whole data
transfer is completed, SCSI control portion 302 resets the
count register to zero.

Next, operations of this embodiment are explained in
detail. Disk driver 205 operates as follows when the time-out
value is set. When demanding time-out operation at data
transfer to disk driver 205, the user program of disk driver
205, that is, data transfer driver 203, or volume manager 204
in this embodiment sets a time-out value by using the field of
the block device control table which is common to the system
(a table for controlling a device which transfers data by
block unit as disk apparatus do), where disk driver 205 could
be used freely when disk driver 205 calls the data transfer
subroutine. When the time-out value is set, disk driver 205
transmits the time-out value and makes a request for sending
the above mentioned vendor unique command to SCSI bus
adaptor 109 before disk driver 205 makes a request for
sending a read/write command to SCSI bus adaptor 109.
After the request for sending the read/write command to
SCSI bus adaptor 109 is completed, the disk driver 205
enters a sleeping condition until the completion of
processing is notified by interruption from SCSI bus adaptor
109. After being notified of the completion of processing,
disk driver 205 resets the time-out value in the above
mentioned control table to zero, and reports the transferred
data quantity as the data transfer quantity (status data)
which is prepared in the control table.
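
The count-down behaviour of timer portion 305 can be modelled with a small Python simulation. This is only a sketch under assumptions: time is counted in abstract ticks, a fixed quantity of data moves per tick, and a time-out value of zero is treated as "no time limit" (an interpretation of the driver resetting the field to zero, not something the description states). The names `TransferStatus` and `timed_transfer` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TransferStatus:
    completed: bool          # True if the whole request finished in time
    transferred_bytes: int   # quantity reported back as status data


def timed_transfer(request_bytes: int, bytes_per_tick: int,
                   timeout_ticks: int) -> TransferStatus:
    """Simulate one read/write command guarded by a count-down register."""
    if bytes_per_tick <= 0:
        raise ValueError("bytes_per_tick must be positive")
    count_register = timeout_ticks
    transferred = 0
    while transferred < request_bytes:
        if timeout_ticks and count_register == 0:
            # Time limit reached: abort and report what was moved so far.
            return TransferStatus(False, transferred)
        transferred = min(request_bytes, transferred + bytes_per_tick)
        if timeout_ticks:
            count_register -= 1
    # The transfer finished first, so the count register is simply cleared.
    return TransferStatus(True, transferred)


partial = timed_transfer(request_bytes=1000, bytes_per_tick=100, timeout_ticks=5)
full = timed_transfer(request_bytes=1000, bytes_per_tick=100, timeout_ticks=20)
```

The caller always gets a transferred quantity back, so an upper layer can decide how much of the request is usable after a time-out.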

Next, a concrete example of the embodiment which is 
applied to a logical disk apparatus constructed by stripe is 
explained as follows. 

By the way, an input-output processing system in the
present embodiment uses a disk apparatus constructed by
stripe in order to accelerate the processing speed by control
program 201, provides high-speed device 108 with data, and
writes output data from high-speed device 108 into a
logical disk apparatus. In order to utilize the processing
ability of high-speed device 108 effectively, it is considered
that reducing latency before processing starts is more
significant than the data quantity actually transferred,
especially when the data transfer is started. Since a large
quantity of data is originally dealt with, the whole
performance is not influenced by an increase in the number of
input-output operations.

Control program 201 provides a time-out value and 
input-output size to a logical disk apparatus constructed by 
stripe via data transfer driver 203 as required. When the 
volume manager 204 receives the time-out value via the 
above mentioned block device control table, the volume 
manager 204 sets the time-out value received from data 
transfer driver 203 into a time-out field of the control table 
prepared in each disk apparatus, when input and output of
the disk apparatus constructed by stripe is started.

FIG. 8 shows a state in which the volume manager 204
has synchronized with the four disk apparatus after a
time-out has occurred and receives the data transfer quantity
from each disk apparatus, when the volume manager 204 sets a
time-out value and reads out the logical disk apparatus
constructed by stripe using four disk apparatus. FIG. 8
includes a first disk apparatus 901, a second disk apparatus
902, a third disk apparatus 903 and a fourth disk apparatus
904. The vertical direction of respective disk apparatuses
shows a data transfer flow of each disk apparatus. The
horizontal dotted line shows data quantity of one chunk of
the logical disk. Accordingly, in this example, four chunks
of data have been transferred to a logical disk apparatus
constructed by stripe as an input-output size.

In an input-output processing system of the present
invention, a plurality of instructions to the same disk
apparatus could be gathered if possible and effective, as
already explained in the second embodiment. In FIG. 8, four
chunks of input-output instructions are transferred together
to each disk apparatus, and a time-out value is transferred
with the input-output instructions.

As already explained, disk driver 205 and SCSI bus
adaptor 109 have data chaining functions. Therefore, there is
no problem even if the memory address of destination or source
is not continuous (in FIG. 8, assuming that the flat surface
is memory space, the horizontal direction is continuous).

In the transfer state of each disk apparatus 901-904 shown
in FIG. 8, after volume manager 204 has synchronized with the
four disk apparatuses following the time-out, the transfer is
not filled except in the fourth disk apparatus 904. However,
since every disk apparatus has completed data transfer up to
the third chunk shown in 905, the volume manager 204 reports
to the data transfer driver 203 that three chunks of data
transfer have been completed. This embodiment shows that
there is a possibility that the data quantity becomes zero if
the time-out setting is too short. Therefore, it is effective
for users to give a plurality of chunks of the logical disk
apparatus as an input-output size when sending an
input-output request, and give a proper corresponding
time-out value.
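
The report made by the volume manager after a time-out can be sketched as taking the smallest number of whole stripes completed by any disk. The function name and the byte counts below are illustrative assumptions; the description only states that three chunks are reported in the FIG. 8 example.

```python
from typing import List


def completed_chunks(transferred_per_disk: List[int],
                     stripe_widths: List[int]) -> int:
    """Number of logical chunks that every disk has fully transferred.

    A chunk of the logical disk is complete only when each disk has moved
    its whole stripe for that chunk, so the slowest disk decides the result.
    """
    if not transferred_per_disk or len(transferred_per_disk) != len(stripe_widths):
        raise ValueError("need one transferred quantity per stripe width")
    return min(done // width
               for done, width in zip(transferred_per_disk, stripe_widths))


# Hypothetical quantities (KB) for four disks after the time-out: only the
# fourth disk finished all four stripes of 32 KB.
reported = completed_chunks([100, 110, 96, 128], [32, 32, 32, 32])
```

With a time-out that is too short, at least one disk finishes no whole stripe and the reported quantity is zero, which matches the caution given above.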

Although the input-output operation is completed by
setting a time limit in this embodiment, almost all data
transfers are synchronized as shown in FIG. 8.

When read/write is carried out by an input-output size of a
plurality of chunks, if gathering inputs and outputs to the
same disk apparatus constructed by stripe together is not
carried out because it is not effective, volume manager 204
synchronizes between the disk apparatus for every chunk, one
by one. When input-output instructions are sent to every
disk apparatus, the time-out value appointed by the upper
process is set. However, when input-output instructions are
sent for the second and subsequent chunks, the elapsed time
is measured by a system timer, from the time of sending the
input-output instruction of the first chunk to the time of
completing the synchronization. The elapsed time is
subtracted from the time-out value, and the reduced time-out
value is set as the time-out value when the input-output
instruction of the second chunk is sent. Hereafter, this is
repeated until the time-out value becomes zero. An example of
time setting by using a SCSI vendor unique command is shown
in this embodiment. However, the vendor unique command does
not always have to be used, nor does the SCSI interface.
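
The shrinking time-out for chunk-by-chunk synchronization can be sketched as a loop that charges each chunk's elapsed time against the remaining budget. The names `send_chunks` and `issue_and_sync`, and the use of integer milliseconds, are assumptions for illustration; the callback stands in for issuing the instructions and waiting for the synchronization.

```python
from typing import Callable, List


def send_chunks(chunk_count: int, timeout_ms: int,
                issue_and_sync: Callable[[int, int], int]) -> List[int]:
    """Issue chunks one by one and return the time-out given to each.

    issue_and_sync(chunk_index, timeout_ms) is a hypothetical callback that
    sends the instructions for one chunk to every disk, waits for the
    synchronization, and returns the elapsed time in milliseconds.
    """
    remaining = timeout_ms
    budgets: List[int] = []
    for index in range(chunk_count):
        if remaining <= 0:
            break                      # the overall time limit is used up
        budgets.append(remaining)
        elapsed = issue_and_sync(index, remaining)
        remaining -= elapsed           # reduce the time-out for the next chunk
    return budgets


# Stub device: every chunk takes 300 ms to synchronize.
budgets = send_chunks(5, 1000, lambda index, timeout: 300)
```

In this stub run the fifth chunk is never issued because the budget is exhausted after the fourth.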

As described above, in the present embodiment, a time
limit setting means is arranged which sets the time limit of
input-output operation, and a completion means is
arranged which completes the input-output operation when
the time limit is over. If the predetermined time limit is over
before completion of input-output operation of the
predetermined data quantity, then input-output operation is
completed, and the data quantity transferred during the
limited time is reported at that time. Therefore, processing
delay caused by waiting for input-output synchronization
could be avoided.

The input-output control apparatus includes a timer
function, a means for completing input-output operation,
and a status report means at input-output interruption (report
of transferred data quantity). Therefore, the function is
easily realized.

In a logical disk apparatus comprising a plurality of disk
apparatuses, in case of operating this plurality of disk
apparatus in parallel, synchronization of completion of
input-output between the disk apparatus is taken by setting
a time limit of input-output operation. Thereby, input-output
processing for this logical disk apparatus can be completed
by the time desired by a user. In other words, the system
sends an input-output instruction to each disk apparatus
comprising a logical disk apparatus by setting the same time
limit given from the upper process. When the input-output
operation is completed and a response is received from every
disk apparatus, input-output operation completion statuses
of respective disk apparatuses are reported together to the
upper process. Thereby, although completion states of
input-output operation of respective disk apparatus may be
different from each other, it is possible to complete
input-output operation for a logical disk apparatus by the
time the user desired.

Further, a synchronizing method explained in this
embodiment is applied to a logical disk apparatus
constructed by stripe, as a synchronizing method between a
plurality of disk apparatus. Thereby, the response time of
input-output operation to a logical disk apparatus is
guaranteed by the time limit. At the same time, if a small
difference of completion status of input-output operation is
set, that is, appropriate data quantity according to the
response time is set, input-output operation gets more likely
to be completed in every disk apparatus.

EMBODIMENT 4 

This embodiment provides further realization for
high-speed processing in an input-output processing system of
the present invention. In the fourth embodiment, a mechanism
for carrying out high-speed data transfer between
input-output units is explained, in the input-output systems
explained in the above mentioned embodiments 1-3.

Conventionally, when data are transferred between the
input-output units, data is read from the transferring source
into data space of the user program which carries out data
transfer control, and the read data is written in the apparatus
of the transfer destination. The memory space of the user
program is controlled by the memory control of the system,
which has placed a burden on the memory control of the system
during the data transfer. DMA operation sometimes may not be
carried out from the devices at the data space of the user
program, which has decreased the data transfer rate.
Performance deterioration caused by the burden on memory
control can be prevented by allocating, on the physical
memory of the system, a buffer memory area to which the data
is transferred and which is not concerned with the memory
control of the system, and by controlling the transfer from a
control process which is carried out in kernel space.

FIG. 9 shows an example of bus arbiter 105 for realizing 
this high-speed transfer between devices which do not 
contain logical disks. In FIG. 9, source register 401 and
destination register 402 are respectively mapped at the
memory space on the system, where the ID on local bus 107 of
high-speed device 108 is registered. Bus arbitration logic
circuit 403 controls the local bus.

FIG. 10 is a flow chart which shows how data
transfer driver 203 carries out high-speed transfer between
input-output units except a logical disk. FIG. 11 is a flow
chart which shows how bus arbiter 105 carries out
high-speed transfer between input-output units including a
logical disk.

When control program 201 supplies a transfer source
device, a classification thereof, a data offset, a transfer
destination device, a classification thereof, a data offset,
and transfer data quantity, the data transfer driver 203
operates according to the logic shown in the flow chart of
FIG. 10. The operation of the data transfer driver 203 is
explained referring to FIG. 10. In step 1001, data transfer
driver 203 judges whether a logical disk apparatus is included
or not in the source device or destination device, by the
device classification (any of an ordinary device, a
high-speed device, or a logical disk apparatus) received from
control program 201. If a logical disk apparatus is included,
the data transfer function of bus arbiter 105 is not utilized
because bus arbiter 105 has no interface to the volume
manager and disk driver. Instead, the device is controlled in
step 1004 utilizing memory 102 in the system which is
controlled by data transfer driver 203, and device drivers
(disk driver 205, high-speed device driver 206).

A process shown in step 1004 is explained as follows.
Data transfer driver 203 keeps a static memory area
for data transfer processing on the system. This memory is
kept as pages, for example, contiguous over 2 MB of physical
address, which are not an object for paging, when the system
is initialized, and is mapped on a virtual space as a kernel
space. It is utilized for a two-face buffer to carry out data
transfer. If the chunk size of a logical disk apparatus
constructed by stripe is 128 KB, it is possible to assign an
input-output size of 8 chunks. This processing is
characterized by carrying out asynchronous double buffer
processing in one context provided by control program 201. A
logical disk apparatus is controlled by volume manager 204,
which controls input and output of a plurality of disk
apparatus having asynchronous interface. However, as already
explained, asynchronous devices are mutually synchronized
because the input and output result of the logical disk
apparatus must be synchronously returned to the user.
Therefore, when data transfer driver 203 accesses a logical
disk apparatus, it has to call and wait until the processing
is completed. Accordingly, in order to carry out asynchronous
operation of the two-face buffer, this processing is carried
out as follows.



[When the transfer source is a logical disk]

  Reading from the logical disk apparatus to the first buffer
  (loop start)
    Writing from the first buffer to the high-speed device (asynchronous)
    Reading from the logical disk apparatus to the second buffer
    Waiting for writing from the first buffer to the high-speed device
    Writing from the second buffer to the high-speed device (asynchronous)
    Reading from the logical disk apparatus to the first buffer
    Waiting for writing from the second buffer to the high-speed device
  (loop end)

[When the transfer source is a high-speed device]

  Reading from the high-speed device to the first buffer (asynchronous)
  Waiting for reading from the high-speed device to the first buffer
  (loop start)
    Reading from the high-speed device to the second buffer (asynchronous)
    Writing from the first buffer to the logical disk
    Waiting for reading from the high-speed device to the second buffer
    Reading from the high-speed device to the first buffer (asynchronous)
    Writing from the second buffer to the logical disk
    Waiting for reading from the high-speed device to the first buffer
  (loop end)
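
The first sequence above (logical disk as transfer source) can be sketched in user-level Python with one worker thread standing in for the asynchronous high-speed device interface. This is not the kernel-space driver described here; the function name `copy_disk_to_device`, the callbacks and the split of the 2 MB area into two 1 MB faces are assumptions for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional

# One face of an assumed 2 MB two-face buffer (8 chunks of 128 KB).
FACE_BYTES = 1024 * 1024


def copy_disk_to_device(read_disk: Callable[[int], bytes],
                        write_device: Callable[[bytes], None]) -> int:
    """Copy until read_disk returns b"", overlapping reads with writes.

    read_disk(n) blocks like a logical disk access and returns up to n bytes;
    write_device(data) is run asynchronously on a worker thread.
    """
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending: Optional[Future] = None
        while True:
            data = read_disk(FACE_BYTES)   # synchronous read into the free face
            if pending is not None:
                pending.result()           # wait for the write of the other face
            if not data:
                break
            pending = pool.submit(write_device, data)  # asynchronous write
            total += len(data)
    return total
```

Each read of the logical disk proceeds while the previous face is still being written, which is the overlap the two-face buffer is meant to provide.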



On the other hand, if a data transfer object device does not
include any logical disk apparatus, that is, in data transfer
between high-speed devices or data transfer between disk
apparatus in which disk striping is not effective and
high-speed devices, data transfer is carried out using
bus arbiter 105 having a data transfer function. The operation
of this case is explained as follows. First, the case where,
in step 1002, data transfer driver 203 judges that there is no
difference of transfer rate between devices, that is, data
transfer between high-speed devices, is explained. Since
high-speed device 108 has a DMA controller, it is
possible to carry out data transfer without intervening the
memory 102 by handshaking using bus arbiter 105. In step
1005, the data transfer driver registers the ID (device
address) on the local bus 107 of high-speed device 108 which
could be transfer source or transfer destination, to the source
register 401 and destination register 402 of bus arbiter 105
mapped in the memory space of the system. Further, the data
transfer driver sends a read request to the high-speed device
108 of transfer source by a predetermined input-output size
and by an address of unused physical memory space given
by the system beforehand. The data transfer driver also
sends a size and an address, and sends a write request to
high-speed device 108 of transfer destination. Thereby, bus
acquisition requests are sent to the [illegible] where the ID
is registered in source register 401 and destination register
402. Further, since any busy bit [illegible] is explained
below) of memory 102 is [illegible] register 401 and
destination register 402, the [illegible] transfer driver 203
keeps an area of memory 102, which is [illegible] but not
divided into another page on the physical address, for the
input-output size. In step 1003, each [illegible] memory bit
flag are logically ORed and registered on source register 401
and destination register 402 of bus [illegible] then, in step
[illegible] gives transfer data size and [illegible] devices
of source register 401 [illegible] and sends read/write
requests to both of them. Since there is [illegible] request
until the bus request is accepted. When bus arbiter 105
confirms that the data transfer completion bit is ON for bus
free and at the data transfer completion of [illegible], bus
arbiter 105 asserts grant signals in step 1106. Thereby, the
destination device writes data of the transfer source, which
is transferred to memory 102, into its own device. Thereby,
the data buffer [illegible] data transfer by keeping a
two-face buffer for data transfer, it is possible to carry
out faster data transfer between input-output units which
gives little burden to memory control of the system, in other
words, which can prevent performance degradation caused by a
burden on the memory control of the system.

Further, in this embodiment, the system controls the DMA
controller arranged in a control unit which controls each
input-output unit, and by providing a controller at the main
input-output bus, the controller synchronizes data transfer of
respective input-output units on the bus without passing via
any memory. Therefore, it is possible to avoid transferring
data to memory unnecessarily.

In order that the control program which carries out data
transfer (a data transfer driver) absorbs the velocity
difference between input-output units, the system reserves
memory for the data transfer controller (bus arbiter)
connected to the main input-output bus. Therefore, it is
possible for data transfer

EMBODIMENT 5

This embodiment is an example of providing data transfer
[illegible] each device on control local bus 107 [illegible]
via memory 102, to [illegible] to memory 102. This is realized
by controlling the DMA controller of each device in the
input-output system constructed in embodiments 1-4. Needless
to say, transfer rate between devices has to accord with
[illegible] embodiment, a logical disk apparatus is
constructed to have four stripe disks in each different
disk apparatus on each SCSI bus. However, arrangement of
[illegible] constructed by stripe, where a plurality of SCSI
buses are [illegible] bus (local bus), a plurality
[illegible] apparatus are connected to each SCSI bus,
[illegible] connected to a single disk apparatus, [illegible]
connected to the same bus with the same stripe width,
[illegible] when data transfer between logical disk apparatus
[illegible] disk apparatus connected the same bus [illegible]
corresponding [illegible] of input-output units having a
[illegible] by a plurality of input-output ports which
has a SCSI controller function.

[illegible] mentioned respective embodiments, SCSI is
[illegible] of an input-output interface, an interface does
not need to be SCSI especially. [illegible] to say, other
kinds of interface could be used [illegible] between disk
apparatuses.

What is claimed is:

1. An input-output processing system for inputting and
outputting a large quantity of data comprising a logical disk
control means, said logical disk control means comprising,
a performance data collection means for collecting data
from performance data where performance characteristics
of a plurality of disk apparatus constructing the
input-output system are given by a system manager, or
from measurement of the performance by operating said
disk apparatus;
a logical disk construction means for constructing a
logical disk apparatus using said plurality of disk
apparatus on the basis of performance data collected by
said performance data collection means,
where said logical disk construction means forms logical
disk management data so that the stripe width is set in
order to equalize response time needed for input and
output corresponding to one stripe data of each disk

cTtroUer (bus arbiter) to regulate transfer rate between ^cttag said logical disk apparatus; and 

devices utilizing the memory. Accordingly. h»gh-sp^data 65 ^ mMns cooBoto ^ logical disk 

transfer between input-output units can be cameo oui ap ^mi by said logical disk management data, 
efficiently. 
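
The stripe-width condition recited in claim 1 can be illustrated with a small calculation. The following is only a sketch under an assumed cost model that the patent text does not state: each disk's response time for one stripe is taken to be a fixed per-request overhead plus the stripe width divided by the sustained transfer rate. The names DiskPerf and equalized_stripe_widths, and both performance fields, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiskPerf:
    """Per-disk performance data (assumed model, not from the patent)."""
    name: str
    rate_kb_per_ms: float  # sustained transfer rate
    overhead_ms: float     # fixed per-request cost, e.g. seek plus rotation


def equalized_stripe_widths(disks, total_kb):
    """Split total_kb across disks so every disk finishes its stripe together.

    Under the assumed model, response time = overhead + width / rate.
    Setting that equal to a common time T for all disks and requiring the
    widths to sum to total_kb gives:
        T = (total_kb + sum(rate * overhead)) / sum(rate)
        width_i = rate_i * (T - overhead_i)
    """
    rate_sum = sum(d.rate_kb_per_ms for d in disks)
    weighted_overhead = sum(d.rate_kb_per_ms * d.overhead_ms for d in disks)
    common_ms = (total_kb + weighted_overhead) / rate_sum
    widths = {d.name: d.rate_kb_per_ms * (common_ms - d.overhead_ms)
              for d in disks}
    if any(w <= 0 for w in widths.values()):
        raise ValueError("total_kb too small to give every disk a positive stripe")
    return widths, common_ms
```

For two disks with the same 10 ms overhead and rates of 4 and 2 KB/ms, a 120 KB stripe row splits into 80 KB and 40 KB, and both disks respond in 30 ms under this model.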
2. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprising
a time limit setting means for setting a limit time required
to a processing of said input-output unit at sending
input-output instruction to input-output units; and
a processing completion means for completing input-output
operation which is
started by said input-output instruction when the set
time limit is expired.

3. An input-output processing system for inputting and
outputting a large quantity of data of claim 2, further
comprising
a timer for setting time set by said time limit setting
means;
a means for completing input-output operation by receiving
time expiration information from the timer.
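
The time-limit mechanism of claims 2 and 3 can be sketched in software as follows. This is an illustrative sketch only, assuming a thread-based timer stands in for the claimed timer; the class name TimedIO and the status strings are hypothetical and do not come from the patent.

```python
import threading


class TimedIO:
    """Completes an I/O operation normally or when its time limit expires."""

    def __init__(self, limit_s):
        self.limit_s = limit_s          # value from the "time limit setting"
        self._done = threading.Event()
        self._lock = threading.Lock()
        self.status = None
        self.result = None

    def _complete(self, status, result=None):
        # Whichever of the I/O or the timer finishes first decides the outcome.
        with self._lock:
            if self.status is None:
                self.status, self.result = status, result
                self._done.set()

    def start(self, io_fn):
        # The timer delivers "time expiration information" to _complete.
        timer = threading.Timer(self.limit_s, self._complete,
                                args=("timed_out",))
        timer.daemon = True

        def worker():
            result = io_fn()
            self._complete("ok", result)
            timer.cancel()

        timer.start()
        threading.Thread(target=worker, daemon=True).start()
        return self

    def wait(self):
        self._done.wait()
        return self.status, self.result
```

A caller would start the operation with TimedIO(limit).start(io_fn) and then call wait(); a late result from the underlying I/O is ignored once the time limit has completed the operation.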

4. An input-output processing system for inputting and
outputting a large quantity of data of claim 2: wherein
said input-output unit is a logical disk apparatus
comprised of a plurality of disk apparatus.

5. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprising
a time limit setting means for setting a limit time required
to a processing of an input-output instruction at sending
input-output instruction to input-output units, and
a processing completion means for completing input-output
operation which is started by said input-output
instruction when the set time limit is expired.

6. An input-output processing system for inputting and
outputting a large quantity of data of claim [...]: wherein
said performance measurement means performs an
input-output instruction to measure the response time by
setting conditions in which the response time for each
disk apparatus is the shortest or longest.
7. An input-output processing system for inputting and
outputting a large quantity of data of claim 1: wherein
said performance measurement means carries out
performance measurement of the disk apparatus as a part of
[...] process to the disk apparatus when disk
apparatus are added to the system.
8. An input-output processing system for inputting and
outputting a large quantity of data of claim [...]: wherein
said performance data collection means collects bus
performance data by construction of input-output bus
where a plurality of disk apparatus are connected,
system instruction data to which the system manager
gives bus transfer performance or actual operation of
an input-output unit connected to said bus by said
performance measurement means;
said logical disk construction means constructs a logical
disk apparatus on the basis of this collected
performance data considering bus transfer performance
connected to each disk performance.
9. An input-output processing system for inputting and
outputting a large quantity of data of claim 1: wherein
in said logical disk construction means,
a transfer quantity for every input-output instruction
assigned to disk apparatus comprising logical disk
apparatus is arranged in one track of the disk apparatus.
10. An input-output processing system for inputting and
outputting a large quantity of data of claim 1: wherein
said logical disk construction means equalizes apparent
[...] where the data is placed on the disk apparatus,
by alternate combination of inner and outer
circumferences of respective disk apparatus, if the logical
disk apparatus is constructed by a plurality of homogeneous
disk apparatus whose performance is different between
inner and outer of circumferences.
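
One possible reading of the "alternate combination of inner and outer circumferences" in claim 10 is sketched below. The checkerboard assignment, the two-zone model, and the function names zone_layout and row_rate are all assumptions made for illustration; the claim itself does not specify a layout.

```python
def zone_layout(num_disks, num_rows):
    """Assign each stripe unit to the outer or inner zone of its disk.

    Assumed scheme: alternate zones across disks and across stripe rows,
    so that every row mixes fast (outer) and slow (inner) regions.
    """
    return [["outer" if (row + disk) % 2 == 0 else "inner"
             for disk in range(num_disks)]
            for row in range(num_rows)]


def row_rate(layout_row, outer_rate, inner_rate):
    """Aggregate transfer rate of one stripe row under the two-zone model."""
    return sum(outer_rate if zone == "outer" else inner_rate
               for zone in layout_row)
```

With an even number of homogeneous disks, every stripe row contains the same number of outer and inner units, so the aggregate rate of each row is the same wherever the data sits on the logical disk.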
11. An input-output processing system for inputting and
outputting a large quantity of data of claim 1: wherein
when an input-output request is sent to a logical disk
apparatus,
said logical disk control means dynamically judges the
state of head position of disk apparatus shortly before
sending an input-output instruction to every disk
apparatus constructing the logical disk apparatus, and sends
the input-output instruction so that a time for carrying
out input and output to the logical disk apparatus
becomes shortest.

12. An input-output processing system for inputting and
outputting a large quantity of data of claim [...]: wherein
when an input-output request is sent to a logical disk
apparatus using an input-output size in which a
plurality of input-output requests are generated to each of a
plurality of disk apparatus;
said logical disk control means judges performance
characteristics of disk apparatus [...]
disk apparatus and characteristics of [...]
which is connected to a disk, and decreases frequencies
of sending a plurality of input-output instructions to
the same disk apparatus, according to the necessity.
13. An input-output processing system for inputting and
outputting a large quantity of data of claim 11: wherein
when an input-output request is sent to a logical disk
apparatus [...]
input-output requests are generated to each of a
plurality of disk apparatus;
said logical disk control means sends the input-output
instruction so that the input-output time to the logical
disk apparatus becomes shortest by arranging a
synchronization point in the input-output request to these
[...]

14. An input-output processing system for inputting and
outputting a large quantity of data of claim 1; wherein
said logical disk control means automatically constructs a
logical disk apparatus which satisfies the performance
[...]

15. An input-output processing system for inputting and
outputting a large quantity of data of claim 1, further
comprising
a data transfer driver for carrying out data transfer control
to input-output units,
said data transfer driver carries out data transfer by
ensuring two-face buffer on the system memory, when
an apparatus for sending the input-output instruction
includes a logical disk apparatus.
16. An input-output processing system for inputting and
outputting a large quantity of data of claim 15, further
comprising
a bus arbiter for controlling input-output bus connected to
an input-output unit;
said bus arbiter comprises a source register and a
destination register for registering data transfer source ID
and data transfer destination ID, respectively; and
said data transfer driver carries out data transfer without
using system memory by driving said bus arbiter
when an apparatus for sending the input-output
instruction does not include a logical disk apparatus.
17. An input-output processing system for inputting and
outputting a large quantity of data of claim 15: wherein 
said data transfer driver drives said bus arbiter by ensur- 
ing buffer for data transfer on the system memory, 
when there is difference of data transfer rate between 
input units and output units which are appointed by an 
input-output instruction. 
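
The three transfer modes recited in claims 15 to 17 can be summarized as a small decision routine. This is a sketch only: the enum Path, the function choose_path, the rate parameters, and in particular the order in which the conditions are checked are assumptions for illustration, since the claims do not state how the cases take precedence over each other.

```python
from enum import Enum


class Path(Enum):
    DOUBLE_BUFFER = "two-face buffer on system memory"      # claim 15
    ARBITER_DIRECT = "bus arbiter, no system memory"        # claim 16
    ARBITER_BUFFERED = "bus arbiter with a memory buffer"   # claim 17


def choose_path(src_is_logical_disk, dst_is_logical_disk, src_rate, dst_rate):
    """Pick a transfer mode for one input-output instruction.

    Assumed precedence: a logical disk on either side forces the two-face
    buffer; otherwise a rate mismatch needs a buffer; otherwise the bus
    arbiter moves data device to device without touching system memory.
    """
    if src_is_logical_disk or dst_is_logical_disk:
        return Path.DOUBLE_BUFFER
    if src_rate != dst_rate:
        return Path.ARBITER_BUFFERED
    return Path.ARBITER_DIRECT
```

In this sketch the rates are plain numbers in any consistent unit; a real driver would more likely compare device classes or measured throughput than test for exact equality.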
18. An input-output processing system for inputting and
outputting a large quantity of data of claim 1: further
comprising
a plurality of disk apparatus comprising said logical disk
apparatus which are connected to different input-output 
buses to construct a plurality of logical disk apparatus, 
further comprising 

a copy means arranged between disk apparatuses con- 
nected to the same input-output bus at a disk control 
apparatus for controlling respective disk apparatus 
connected to said respective input-output buses, in 
order to carry out data copy between different logical 
disk apparatus. 